Lviv University
Overview
Question: how the data layers are located in relation to each other?
Note
spatial queries.topological spatial relations
DE-9IM
Dimensionally Extended 9-Intersection Model (DE-9IM).Point objects are 0-dimensionalLineString and LinearRing are 1-dimensionalPolygon objects are 2-dimensional.Interior, boundary and exterior for different geometric data types. The data types can be either 0, 1 or 2-dimensional.
Spatial predicates
When testing how two geometries relate to each other, the DE-9IM model gives a result which is called spatial predicate (also called as binary predicate).
There are plenty of topological relations: altogether 512 with 2D data.
Eight common spatial predicates formed based on spatial relations between two geometries. Modified after Egenhofer et al. (1992).
Types
Making spatial queries in Python
In Python, all the basic spatial predicates are available from shapely library, including:
.intersects().within().contains().overlaps().touches().covers().covered_by().equals().disjoint().crosses()Making spatial queries in Python: within()
Create a couple of Point objects and one Polygon object which we can use to test how they relate to each other:
Making spatial queries in Python
We can check the contents of the new variables by printing them to the screen, for example, in which case we would see
POINT (24.952242 60.1696017)
POINT (24.976567 60.16125)
POLYGON ((24.950899 60.169158, 24.953492 60.169158, 24.95351 60.170104, 24.950958 60.16999, 24.950899 60.169158))
If you want to test whether these Point geometries stored in point1 and point2 are within the polygon, you can call the .within() method as follows:
Making spatial queries in Python
One of the most common spatial queries is to see if a geometry intersects or touches another one. Again, there are binary operations in shapely for checking these spatial relationships:
.intersects() - Two objects intersect if the boundary or interior of one object intersect in any way with the boundary or interior of the other object..touches() - Two objects touch if the objects have at least one point in common and their interiors do not intersect with any part of the other object.Making spatial queries in Python
Let’s try these by creating two LineString geometries and test whether they intersect and touch each other:
Making spatial queries in Python
Making spatial queries in Python
If the lines are fully overlapping with each other they don’t touch due to the spatial relationship rule in the DE-9IM. We can confirm this by checking if line_a touches itself:
No it doesn’t. However, .intersects() and .equals() should produce True for a case when we compare the line_a with itself:
Making spatial queries in Python
The following prints results for all predicates between the point1 and the polygon which we created earlier:
print("Intersects?", point1.intersects(polygon))
print("Within?", point1.within(polygon))
print("Contains?", point1.contains(polygon))
print("Overlaps?", point1.overlaps(polygon))
print("Touches?", point1.touches(polygon))
print("Covers?", point1.covers(polygon))
print("Covered by?", point1.covered_by(polygon))
print("Equals?", point1.equals(polygon))
print("Disjoint?", point1.disjoint(polygon))
print("Crosses?", point1.crosses(polygon))Intersects? True
Within? True
Contains? False
Overlaps? False
Touches? False
Covers? False
Covered by? True
Equals? False
Disjoint? False
Crosses? False
within vs contains
.within() the polygon..contains() the point.Spatial queries using geopandas
Let’s check which points are located within specific areas of Helsinki:
Shape: (34, 1)
geometry
0 POINT (24.91556 60.1632)
1 POINT (24.93166 60.16905)
2 POINT (24.94168 60.16996)
3 POINT (24.97865 60.19005)
4 POINT (24.92151 60.15662)
Shape: (23, 3)
Name Description geometry
18 Koivukylä POLYGON Z ((24.99423 60.33296 0, 25.00007 60.3...
19 Itäinen POLYGON Z ((25.03517 60.23627 0, 25.03585 60.2...
20 Östersundom POLYGON Z ((25.23352 60.25655 0, 25.23744 60.2...
21 Hakunila POLYGON Z ((25.08472 60.27248 0, 25.08492 60.2...
22 Korso POLYGON Z ((25.1238 60.34191 0, 25.11997 60.34...
The data contains 34 address points and 23 district polygons.
Spatial queries using geopandas
Let’s find all points that are within two areas in Helsinki region, namely Itäinen and Eteläinen (‘Eastern’ and ‘Southern’ in English).
Spatial queries using geopandas
Let’s now plot the layers on top of each other. The areas with red color represent the districts that we want to use for testing the spatial relationships against the point layer (shown with blue color):
Spatial queries using geopandas
We should check which Points in the points GeoDataFrame are within the selected polygons stored in the selection geodataframe.
We will use a spatial join as an efficient way to conduct spatial queries in geopandas.
to_frame()
Using to_frame() allows us to avoid attaching any extra attributes from the selection geodataframe to our data, which is what .sjoin() method would normally do (and which it is actually designed for). .geometry.to_frame() will:
selection layerGeoDataFrame (which would otherwise be a GeoSeries).An alternative approach for doing the same thing is to use selection[[selection.active_geometry_name]], which also returns a GeoDataFrame containing only a column with the geodataframe’s active geometry.
sjoin() examples
By default, the .sjoin() uses "intersects" as a spatial predicate, but it is easy to change this. For example, we can investigate which of the districts contain at least one point.
sjoin() examples
To find all possible spatial predicates for a given GeoDataFrame you can call:
{None,
'contains',
'contains_properly',
'covered_by',
'covers',
'crosses',
'dwithin',
'intersects',
'overlaps',
'touches',
'within'}
What is .sindex?
sindex
SpatialIndex object contains the spatial index for our data.
geopandas uses a spatial index called R-tree which is a hierarchical, tree-like structure that divides the space into nested, overlapping rectangles and indexes the bounding boxes of each geometry..sjoin() instead of directly calling .within(), .contains() that come with the shapely geometries (as shown previously).Exercise
How many addresses are located in each district? You can find out the answer by grouping the spatial join result based on the district name.
Description
Spatial join retrieves table attributes from one layer and transfers them into another layer based on their spatial relationship. For example, one can join the attributes of a polygon layer into a point layer where each point would get the attributes of a polygon that intersects with the point.
It is good to remember that spatial join is always conducted between two layers at a time.
Spatial join allows you to combine attribute information from multiple layers based on spatial relationship.
sjoin details
In spatial join, there are two set of options that you can control, which ultimately influence how the data is transferred between the layers:
Spatial predicates
True will be kept in the result.sjoin type
The other parameter that you can use to control how the spatial join is conducted is the spatial join type. There are three different join types that influence the outcome of the spatial join:
inner joinleft outer joinright outer joinSpatial join with Python
Spatial join can be done easily with geopandas using the .sjoin() method. Next, we will learn how to use this method to perform a spatial join between two layers:
addresses which are the locations that we geocoded previously;population grid which is a 250m x 250m grid polygon layer that contains population information from the Helsinki Region (source: Helsinki Region Environmental Services Authority).Spatial join with Python
Let’s start by reading the data:
| geometry | |
|---|---|
| 0 | POINT (24.91556 60.1632) |
| 1 | POINT (24.93166 60.16905) |
As we can see, the addresses variable contains address Points which represent a selection of public transport stations in the Helsinki Region.
Spatial join with Python
| id | inhabitants | occupancy_rate | geometry | |
|---|---|---|---|---|
| 0 | Vaestotietoruudukko_2021.1 | 5 | 50.60 | POLYGON ((25472499.995 6689749.005, 25472499.9... |
| 1 | Vaestotietoruudukko_2021.2 | 7 | 36.71 | POLYGON ((25472499.995 6685998.998, 25472499.9... |
The pop_grid dataset contains few columns, namely a unique id, the number of inhabitants per grid cell, and the occupancy_rate as percentage.
Preparations for spatial join
The basic requirement for a successful spatial join is that the layers should overlap with each other in space. If the geometries between the layers do not share the same CRS, it is very likely that the spatial join will fail and produces an empty GeoDataFrame.
Address points CRS: None
Population grid CRS: ETRS89 / GK25FIN
To fix this issue, let’s reproject the geometries in the addresses GeoDataFrame to the same CRS as pop_grid using the .to_crs() method.
True
Preparations for spatial join
As a last preparatory step, let’s visualize both datasets on top of each other to see how the inhabitants are distributed over the region, and how the address points are located in relation to the grid:

Join the layers based on spatial relationship
The aim here is to get information about
How many people live in a given polygon that contains an individual address-point?
Thus, we want to join the attribute information from the pop_grid layer into the addresses Point layer using the .sjoin() method.
| geometry | index_right | id | inhabitants | occupancy_rate | |
|---|---|---|---|---|---|
| 0 | POINT (25495311.608 6672258.695) | 3262 | Vaestotietoruudukko_2021.3263 | 505 | 14.01 |
| 1 | POINT (25496206.216 6672909.016) | 3381 | Vaestotietoruudukko_2021.3382 | 172 | 27.67 |
| 2 | POINT (25496762.723 6673010.538) | 3504 | Vaestotietoruudukko_2021.3505 | 43 | 61.44 |
| 3 | POINT (25498815.415 6675246.744) | 3845 | Vaestotietoruudukko_2021.3846 | 757 | 33.98 |
| 4 | POINT (25495641.151 6671525.076) | 3310 | Vaestotietoruudukko_2021.3311 | 1402 | 29.76 |
| 5 | POINT (25504528.607 6680282.118) | 5058 | Vaestotietoruudukko_2021.5059 | 283 | 20.51 |
| 6 | POINT (25506082.816 6678702.5) | 5407 | Vaestotietoruudukko_2021.5408 | 155 | 43.42 |
| 7 | POINT (25501631.384 6685110.943) | 4302 | Vaestotietoruudukko_2021.4303 | 236 | 37.62 |
| 8 | POINT (25501586.789 6683452.706) | 4309 | Vaestotietoruudukko_2021.4310 | 253 | 33.56 |
| 10 | POINT (25496896.718 6673162.114) | 3504 | Vaestotietoruudukko_2021.3505 | 43 | 61.44 |
| 11 | POINT (25493506.585 6679793.504) | 2985 | Vaestotietoruudukko_2021.2986 | 629 | 36.11 |
| 12 | POINT (25493093.193 6680589.107) | 2886 | Vaestotietoruudukko_2021.2887 | 817 | 29.63 |
| 13 | POINT (25497118.126 6678784.269) | 3535 | Vaestotietoruudukko_2021.3536 | 163 | 33.08 |
| 14 | POINT (25500717.139 6682045.953) | 4114 | Vaestotietoruudukko_2021.4115 | 99 | 33.06 |
| 15 | POINT (25494134.034 6678278.391) | 3079 | Vaestotietoruudukko_2021.3080 | 425 | 37.04 |
| 16 | POINT (25492575.043 6681972.681) | 2775 | Vaestotietoruudukko_2021.2776 | 353 | 32.35 |
| 17 | POINT (25498089.949 6679640.055) | 3728 | Vaestotietoruudukko_2021.3729 | 255 | 38.51 |
| 19 | POINT (25492287.623 6679040.234) | 2723 | Vaestotietoruudukko_2021.2724 | 231 | 33.98 |
| 20 | POINT (25499646.711 6681218.844) | 3945 | Vaestotietoruudukko_2021.3946 | 336 | 35.39 |
| 22 | POINT (25504344.005 6677452.185) | 5008 | Vaestotietoruudukko_2021.5009 | 453 | 21.73 |
| 23 | POINT (25507495.008 6677173.504) | 5626 | Vaestotietoruudukko_2021.5627 | 231 | 35.18 |
| 24 | POINT (25504169.174 6679129.384) | 4942 | Vaestotietoruudukko_2021.4943 | 638 | 33.41 |
| 25 | POINT (25506082.506 6680577.96) | 5399 | Vaestotietoruudukko_2021.5400 | 361 | 24.29 |
| 26 | POINT (25497840.365 6675020.358) | 3696 | Vaestotietoruudukko_2021.3697 | 829 | 33.92 |
| 27 | POINT (25501570.947 6675734.321) | 4330 | Vaestotietoruudukko_2021.4331 | 271 | 37.29 |
| 28 | POINT (25500377.732 6675098.523) | 4079 | Vaestotietoruudukko_2021.4080 | 321 | 38.40 |
| 29 | POINT (25497199.344 6674065.972) | 3552 | Vaestotietoruudukko_2021.3553 | 215 | 43.71 |
| 30 | POINT (25496286.911 6672913.768) | 3411 | Vaestotietoruudukko_2021.3412 | 286 | 25.86 |
| 31 | POINT (25496128.995 6672625.288) | 3382 | Vaestotietoruudukko_2021.3383 | 1409 | 32.37 |
| 32 | POINT (25495624.409 6671766.187) | 3309 | Vaestotietoruudukko_2021.3310 | 995 | 37.07 |
| 33 | POINT (25497062.747 6673226.621) | 3555 | Vaestotietoruudukko_2021.3556 | 281 | 36.14 |
Join the layers based on spatial relationship
inhabitants and occupancy_rateindex_right and id_right which tell the index and id of the matching polygon in the right-side member of the spatial joinid column in the left-side member of the spatial join was renamed as id_left._left and _right are appended to the column names to differentiate the columns in cases where there are identical column names present in both GeoDataFrames.Join the layers based on spatial relationship
Let’s also visualize the joined output. In the following, we plot the points using the inhabitants column to indicate the color:
Join the layers based on spatial relationship
It is useful to investigate if we lost any data while doing the spatial join. Let’s check this by comparing the number of rows in our result to how many addresses we had originally:
We can investigate where the points outside of polygons are located:
Join the layers based on spatial relationship
We might want to keep the information for the points that did not get a match based on the spatial relationship.
| geometry | index_right | id | inhabitants | occupancy_rate | |
|---|---|---|---|---|---|
| 0 | POINT (25495311.608 6672258.695) | 3262.0 | Vaestotietoruudukko_2021.3263 | 505.0 | 14.01 |
| 1 | POINT (25496206.216 6672909.016) | 3381.0 | Vaestotietoruudukko_2021.3382 | 172.0 | 27.67 |
| 2 | POINT (25496762.723 6673010.538) | 3504.0 | Vaestotietoruudukko_2021.3505 | 43.0 | 61.44 |
| 3 | POINT (25498815.415 6675246.744) | 3845.0 | Vaestotietoruudukko_2021.3846 | 757.0 | 33.98 |
| 4 | POINT (25495641.151 6671525.076) | 3310.0 | Vaestotietoruudukko_2021.3311 | 1402.0 | 29.76 |
| 5 | POINT (25504528.607 6680282.118) | 5058.0 | Vaestotietoruudukko_2021.5059 | 283.0 | 20.51 |
| 6 | POINT (25506082.816 6678702.5) | 5407.0 | Vaestotietoruudukko_2021.5408 | 155.0 | 43.42 |
| 7 | POINT (25501631.384 6685110.943) | 4302.0 | Vaestotietoruudukko_2021.4303 | 236.0 | 37.62 |
| 8 | POINT (25501586.789 6683452.706) | 4309.0 | Vaestotietoruudukko_2021.4310 | 253.0 | 33.56 |
| 9 | POINT (25492851.783 6678869.234) | NaN | NaN | NaN | NaN |
| 10 | POINT (25496896.718 6673162.114) | 3504.0 | Vaestotietoruudukko_2021.3505 | 43.0 | 61.44 |
| 11 | POINT (25493506.585 6679793.504) | 2985.0 | Vaestotietoruudukko_2021.2986 | 629.0 | 36.11 |
| 12 | POINT (25493093.193 6680589.107) | 2886.0 | Vaestotietoruudukko_2021.2887 | 817.0 | 29.63 |
| 13 | POINT (25497118.126 6678784.269) | 3535.0 | Vaestotietoruudukko_2021.3536 | 163.0 | 33.08 |
| 14 | POINT (25500717.139 6682045.953) | 4114.0 | Vaestotietoruudukko_2021.4115 | 99.0 | 33.06 |
| 15 | POINT (25494134.034 6678278.391) | 3079.0 | Vaestotietoruudukko_2021.3080 | 425.0 | 37.04 |
| 16 | POINT (25492575.043 6681972.681) | 2775.0 | Vaestotietoruudukko_2021.2776 | 353.0 | 32.35 |
| 17 | POINT (25498089.949 6679640.055) | 3728.0 | Vaestotietoruudukko_2021.3729 | 255.0 | 38.51 |
| 18 | POINT (25496358.8 6676198.28) | NaN | NaN | NaN | NaN |
| 19 | POINT (25492287.623 6679040.234) | 2723.0 | Vaestotietoruudukko_2021.2724 | 231.0 | 33.98 |
| 20 | POINT (25499646.711 6681218.844) | 3945.0 | Vaestotietoruudukko_2021.3946 | 336.0 | 35.39 |
| 21 | POINT (25501142.787 6681208.443) | NaN | NaN | NaN | NaN |
| 22 | POINT (25504344.005 6677452.185) | 5008.0 | Vaestotietoruudukko_2021.5009 | 453.0 | 21.73 |
| 23 | POINT (25507495.008 6677173.504) | 5626.0 | Vaestotietoruudukko_2021.5627 | 231.0 | 35.18 |
| 24 | POINT (25504169.174 6679129.384) | 4942.0 | Vaestotietoruudukko_2021.4943 | 638.0 | 33.41 |
| 25 | POINT (25506082.506 6680577.96) | 5399.0 | Vaestotietoruudukko_2021.5400 | 361.0 | 24.29 |
| 26 | POINT (25497840.365 6675020.358) | 3696.0 | Vaestotietoruudukko_2021.3697 | 829.0 | 33.92 |
| 27 | POINT (25501570.947 6675734.321) | 4330.0 | Vaestotietoruudukko_2021.4331 | 271.0 | 37.29 |
| 28 | POINT (25500377.732 6675098.523) | 4079.0 | Vaestotietoruudukko_2021.4080 | 321.0 | 38.40 |
| 29 | POINT (25497199.344 6674065.972) | 3552.0 | Vaestotietoruudukko_2021.3553 | 215.0 | 43.71 |
| 30 | POINT (25496286.911 6672913.768) | 3411.0 | Vaestotietoruudukko_2021.3412 | 286.0 | 25.86 |
| 31 | POINT (25496128.995 6672625.288) | 3382.0 | Vaestotietoruudukko_2021.3383 | 1409.0 | 32.37 |
| 32 | POINT (25495624.409 6671766.187) | 3309.0 | Vaestotietoruudukko_2021.3310 | 995.0 | 37.07 |
| 33 | POINT (25497062.747 6673226.621) | 3555.0 | Vaestotietoruudukko_2021.3556 | 281.0 | 36.14 |
Join the layers based on spatial relationship
Let’s investigate a bit more to see which rows did not have a matching polygon in the population grid.
Exercise
Do the spatial join another way around, i.e. make a spatial join where you join information from the address points into the population grid. How does the result differ from the version where we joined information from the grids to the points? What would be the benefit of doing the join this way around?
Description
First Law of Geography
Everything is related to everything, but near things are more related than distant things.
The basic idea of finding a nearest neighbour based on geographic distance.
Why limit?
Overview
Typical tasks:
GeoDataFrame based on Pointobjects from another GeoDataFramescipy library to find K-Nearest Neighbors (KNN) with Point data.Practical example
Where is the closest public transport stop from my place of residence?
geopandas, we can find nearest neighbors for all geometries in a given GeoDataFrame using the .sjoin_nearest() method.Number of stops: 8377
| stop_name | stop_lat | stop_lon | stop_id | geometry | |
|---|---|---|---|---|---|
| 0 | Ritarihuone | 60.16946 | 24.95667 | 1010102 | POINT (24.95667 60.16946) |
| 1 | Kirkkokatu | 60.17127 | 24.95657 | 1010103 | POINT (24.95657 60.17127) |
sjoin_nearest
.sjoin() methodspatial index called STRTree which is an efficient implementation of the R-tree dynamic index structure for spatial searching.Reproject
Let’s start by reprojecting our latitude and longitude values into a metric system using the national EUREF-FIN coordinate reference system (EPSG code 3067) for Finland:
Use sjoin_nearest
Because we are interested to find the closest stop geometry for each building, the buildings GeoDataFrame is on the left hand side of the command.
Note that we give a name for a column which is used to store information about the distance between a given building and the closest stop (this is optional):
CPU times: user 0 ns, sys: 1 μs, total: 1 μs
Wall time: 1.19 μs
| name | geometry | index_right | stop_name | stop_lat | stop_lon | stop_id | distance | |
|---|---|---|---|---|---|---|---|---|
| 0 | None | POINT (381166.6 6676424.438) | 1131 | Muusantori | 60.20749 | 24.857450 | 1304138 | 92.679893 |
| 1 | Uimastadion | POINT (385236.565 6674238.472) | 467 | Auroran sairaala | 60.19145 | 24.925540 | 1171122 | 400.243370 |
| 2 | None | POINT (386317.478 6672100.648) | 61 | Senaatintori | 60.16901 | 24.950460 | 1020450 | 109.819633 |
| 3 | Hartwall Arena | POINT (385225.109 6676120.56) | 532 | Veturitie | 60.20661 | 24.929680 | 1174112 | 104.632434 |
| 4 | Talli | POINT (385079.733 6676989.745) | 496 | Posti 1 | 60.21345 | 24.917550 | 1172143 | 472.248282 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 158726 | None | POINT (373896.051 6677472.204) | 4621 | Samis | 60.21369 | 24.720970 | 3170209 | 195.675552 |
| 158727 | None | POINT (372425.65 6676945.528) | 4654 | Yrjö Liipolan tie | 60.20922 | 24.695470 | 3170244 | 137.137640 |
| 158728 | None | POINT (374696.625 6677972.738) | 4655 | Kandidaatintie | 60.21818 | 24.736987 | 3170245 | 135.341745 |
| 158729 | Granhultsskolan | POINT (373287.582 6677731.639) | 4624 | Uimahalli | 60.21638 | 24.711260 | 3170212 | 99.408108 |
| 158730 | Kauniaisten kirkko | POINT (374112.695 6677330.017) | 4665 | Postitori | 60.21267 | 24.728250 | 3170257 | 67.790422 |
159818 rows × 8 columns
sjoin_nearest
distance_col="distance".index_right provides information about the index number of the closest stop in the stops GeoDataFrame.buildings GeoDataFrame, the distance between the building and two (or more) stops have been exactly the same (i.e. they are equidistant).sjoin_nearest() will store both records into the results by duplicating the building information and attaching information from the stops into separate rows accordingly.sjoin_nearest
We can make computations faster by specifying a max_distance parameter that specifies the maximum search distance:
CPU times: user 1e+03 ns, sys: 1 μs, total: 2 μs
Wall time: 1.91 μs
| name | geometry | index_right | stop_name | stop_lat | stop_lon | stop_id | distance | |
|---|---|---|---|---|---|---|---|---|
| 0 | None | POINT (381166.6 6676424.438) | 1131 | Muusantori | 60.207490 | 24.857450 | 1304138 | 92.679893 |
| 10 | None | POINT (384645.078 6669763.917) | 592 | Hernesaaren laituri | 60.148287 | 24.923281 | 1204101 | 57.786201 |
| 12 | None | POINT (384782.782 6669707.017) | 595 | Hernesaaren laituri | 60.148680 | 24.924240 | 1204108 | 79.844881 |
| 13 | None | POINT (384714.47 6669710.887) | 592 | Hernesaaren laituri | 60.148287 | 24.923281 | 1204101 | 32.640335 |
| 16 | None | POINT (385040.806 6670639.517) | 596 | Hernesaarenkatu | 60.156110 | 24.930370 | 1204109 | 87.888087 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 158718 | None | POINT (374219.973 6677006.1) | 4651 | Kauniaisten asema | 60.210830 | 24.729330 | 3170240 | 69.803673 |
| 158719 | None | POINT (374231.494 6676967.402) | 4642 | Kauniaistentie | 60.209810 | 24.731510 | 3170230 | 63.384115 |
| 158720 | None | POINT (374602.815 6677396.18) | 4673 | Raamattuopisto | 60.213524 | 24.736685 | 3170265 | 56.594370 |
| 158729 | Granhultsskolan | POINT (373287.582 6677731.639) | 4624 | Uimahalli | 60.216380 | 24.711260 | 3170212 | 99.408108 |
| 158730 | Kauniaisten kirkko | POINT (374112.695 6677330.017) | 4665 | Postitori | 60.212670 | 24.728250 | 3170257 | 67.790422 |
40128 rows × 8 columns
Connect with a line
index_right column in our table and conduct a normal table join using the .merge() methodgeometry columns from the stops GeoDataFrame because all the other attributes already exist in our results:| name | geometry_x | index_right | stop_name | stop_lat | stop_lon | stop_id | distance | geometry_y | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | None | POINT (381166.6 6676424.438) | 1131 | Muusantori | 60.20749 | 24.85745 | 1304138 | 92.679893 | POINT (381256.66 6676446.317) |
| 1 | Uimastadion | POINT (385236.565 6674238.472) | 467 | Auroran sairaala | 60.19145 | 24.92554 | 1171122 | 400.243370 | POINT (384973.331 6674539.973) |
| 2 | None | POINT (386317.478 6672100.648) | 61 | Senaatintori | 60.16901 | 24.95046 | 1020450 | 109.819633 | POINT (386277.25 6671998.462) |
| 3 | Hartwall Arena | POINT (385225.109 6676120.56) | 532 | Veturitie | 60.20661 | 24.92968 | 1174112 | 104.632434 | POINT (385255.784 6676220.595) |
| 4 | Talli | POINT (385079.733 6676989.745) | 496 | Posti 1 | 60.21345 | 24.91755 | 1172143 | 472.248282 | POINT (384607.679 6677003.267) |
Connect with a line
LineString between the buildings and the closest stops, we can use the linestrings() function of the shapely libraryPoint objects stored in the geometry_x and geometry_y columns, we use the .get_coordinates() method of geopandas that returns the x and y coordinates as Series objects/columns.numpy arrays using the to_numpy() method which we pass to the linestrings() function.LineStrings into a column geometry which we set as the active geometry of the GeoDataFrame:| name | geometry_x | index_right | stop_name | stop_lat | stop_lon | stop_id | distance | geometry_y | geometry | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | None | POINT (381166.6 6676424.438) | 1131 | Muusantori | 60.20749 | 24.85745 | 1304138 | 92.679893 | POINT (381256.66 6676446.317) | LINESTRING (381166.6 381256.66, 6676424.438 66... |
| 1 | Uimastadion | POINT (385236.565 6674238.472) | 467 | Auroran sairaala | 60.19145 | 24.92554 | 1171122 | 400.243370 | POINT (384973.331 6674539.973) | LINESTRING (385236.565 384973.331, 6674238.472... |
| 2 | None | POINT (386317.478 6672100.648) | 61 | Senaatintori | 60.16901 | 24.95046 | 1020450 | 109.819633 | POINT (386277.25 6671998.462) | LINESTRING (386317.478 386277.25, 6672100.648 ... |
| 3 | Hartwall Arena | POINT (385225.109 6676120.56) | 532 | Veturitie | 60.20661 | 24.92968 | 1174112 | 104.632434 | POINT (385255.784 6676220.595) | LINESTRING (385225.109 385255.784, 6676120.56 ... |
| 4 | Talli | POINT (385079.733 6676989.745) | 496 | Posti 1 | 60.21345 | 24.91755 | 1172143 | 472.248282 | POINT (384607.679 6677003.267) | LINESTRING (385079.733 384607.679, 6676989.745... |
Connect with a line
Let’s create a map that visualizes the buildings, stops and the connecting lines between the buildings and the closest stops in a single figure:
Connect with a line
It is easy to do some descriptive analysis based on this data providing information about levels of access to public transport in the region:
Nearest neighbors with Polygon and LineString data
Let’s find the closest urban park to building polygons in a neighborhood called Kamppi, which is located in Helsinki, Finland. Then, we’ll find the closest drivable road (LineString) to each building.
| osmid | building | name | geometry | |
|---|---|---|---|---|
| 0 | 11711721042 | yes | Nice Bike Pyörähuolto | POINT (384966.661 6671503.786) |
| 1 | 8035238 | public | Lasipalatsi | POLYGON ((385459.65 6672184.469, 385456.356 66... |
| 2 | 8042297 | yes | Radisson Blu Royal | POLYGON ((385104.154 6671916.693, 385101.584 6... |
| 3 | 14797170 | school | None | POLYGON ((384815.326 6671762.71, 384815.792 66... |
| 4 | 14797171 | yes | None | POLYGON ((384797.759 6671853.253, 384798.253 6... |
| ... | ... | ... | ... | ... |
| 450 | 8092998 | yes | None | POLYGON ((384747.465 6671811.996, 384744.27 66... |
| 451 | 8280536 | apartments | None | POLYGON ((384839.007 6671934.815, 384839.485 6... |
| 452 | 8525159 | civic | None | POLYGON ((385495.275 6672164.009, 385494.928 6... |
| 453 | 8525161 | civic | None | POLYGON ((385486.225 6672173.653, 385486.717 6... |
| 454 | 8535506 | civic | None | POLYGON ((385481.13 6672167.861, 385482.372 66... |
455 rows × 4 columns
Nearest neighbors with Polygon and LineString data
We find the nearest park for each building Polygon and store the distance into the column distance:
| osmid_left | building | name_left | geometry | index_right | osmid_right | leisure | name_right | distance | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 11711721042 | yes | Nice Bike Pyörähuolto | POINT (384966.661 6671503.786) | 12 | 1227991181 | park | Kaartin lasaretin puisto | 100.208527 |
| 1 | 8035238 | public | Lasipalatsi | POLYGON ((385459.65 6672184.469, 385456.356 66... | 1 | 8042613 | park | Simonpuistikko | 16.284929 |
| 2 | 8042297 | yes | Radisson Blu Royal | POLYGON ((385104.154 6671916.693, 385101.584 6... | 8 | 37390082 | park | None | 40.039501 |
| 3 | 14797170 | school | None | POLYGON ((384815.326 6671762.71, 384815.792 66... | 5 | 26999855 | park | None | 0.000000 |
| 4 | 14797171 | yes | None | POLYGON ((384797.759 6671853.253, 384798.253 6... | 5 | 26999855 | park | None | 14.873403 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 450 | 8092998 | yes | None | POLYGON ((384747.465 6671811.996, 384744.27 66... | 5 | 26999855 | park | None | 70.819624 |
| 451 | 8280536 | apartments | None | POLYGON ((384839.007 6671934.815, 384839.485 6... | 8 | 37390082 | park | None | 38.574646 |
| 452 | 8525159 | civic | None | POLYGON ((385495.275 6672164.009, 385494.928 6... | 1 | 8042613 | park | Simonpuistikko | 32.792083 |
| 453 | 8525161 | civic | None | POLYGON ((385486.225 6672173.653, 385486.717 6... | 1 | 8042613 | park | Simonpuistikko | 90.919207 |
| 454 | 8535506 | civic | None | POLYGON ((385481.13 6672167.861, 385482.372 66... | 1 | 8042613 | park | Simonpuistikko | 87.821936 |
455 rows × 9 columns
Nearest neighbors with Polygon and LineString data
In a similar manner, we can also find the nearest road from each building as follows:
| osmid_left | building | name_left | geometry | index_right | osmid_right | name_right | highway | distance | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 11711721042 | yes | Nice Bike Pyörähuolto | POINT (384966.661 6671503.786) | 182 | [126894680, 126894676, 126894678, 126894679] | Eerikinkatu | residential | 11.181066 |
| 0 | 11711721042 | yes | Nice Bike Pyörähuolto | POINT (384966.661 6671503.786) | 24 | [126894680, 126894676, 126894678, 126894679] | Eerikinkatu | residential | 11.181066 |
| 1 | 8035238 | public | Lasipalatsi | POLYGON ((385459.65 6672184.469, 385456.356 66... | 15 | [42574048, 42574049, 28920739, 77891210, 26999... | Arkadiankatu | secondary | 52.015824 |
| 1 | 8035238 | public | Lasipalatsi | POLYGON ((385459.65 6672184.469, 385456.356 66... | 33 | [42574048, 42574049, 28920739, 77891210, 26999... | Arkadiankatu | secondary | 52.015824 |
| 2 | 8042297 | yes | Radisson Blu Royal | POLYGON ((385104.154 6671916.693, 385101.584 6... | 83 | [37135576, 8035726, 37135575] | Salomonkatu | residential | 6.659959 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 452 | 8525159 | civic | None | POLYGON ((385495.275 6672164.009, 385494.928 6... | 107 | 51707742 | Yrjönkatu | residential | 88.553223 |
| 453 | 8525161 | civic | None | POLYGON ((385486.225 6672173.653, 385486.717 6... | 15 | [42574048, 42574049, 28920739, 77891210, 26999... | Arkadiankatu | secondary | 90.569914 |
| 453 | 8525161 | civic | None | POLYGON ((385486.225 6672173.653, 385486.717 6... | 33 | [42574048, 42574049, 28920739, 77891210, 26999... | Arkadiankatu | secondary | 90.569914 |
| 454 | 8535506 | civic | None | POLYGON ((385481.13 6672167.861, 385482.372 66... | 15 | [42574048, 42574049, 28920739, 77891210, 26999... | Arkadiankatu | secondary | 96.128437 |
| 454 | 8535506 | civic | None | POLYGON ((385481.13 6672167.861, 385482.372 66... | 33 | [42574048, 42574049, 28920739, 77891210, 26999... | Arkadiankatu | secondary | 96.128437 |
703 rows × 9 columns
Nearest neighbors with Polygon and LineString data
To better understand how the spatial join between the buildings and roads have been conducted, we can again visualize the nearest neighbors with a straight line. To do this, we first bring the geometries from the roads GeoDataFrame into the same table with the buildings:
| osmid_left | building | name_left | geometry_x | index_right | osmid_right | name_right | highway | distance | geometry_y | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 11711721042 | yes | Nice Bike Pyörähuolto | POINT (384966.661 6671503.786) | 182 | [126894680, 126894676, 126894678, 126894679] | Eerikinkatu | residential | 11.181066 | LINESTRING (385040.141 6671566.384, 385034.832... |
| 0 | 11711721042 | yes | Nice Bike Pyörähuolto | POINT (384966.661 6671503.786) | 24 | [126894680, 126894676, 126894678, 126894679] | Eerikinkatu | residential | 11.181066 | LINESTRING (384942.149 6671500.856, 384950.743... |
| 1 | 8035238 | public | Lasipalatsi | POLYGON ((385459.65 6672184.469, 385456.356 66... | 15 | [42574048, 42574049, 28920739, 77891210, 26999... | Arkadiankatu | secondary | 52.015824 | LINESTRING (385285.226 6672266.801, 385296.799... |
Nearest neighbors with Polygon and LineString data
We can use shortest_line() from the shapely library that returns a LineString object between the input geometries showing the shortest distance between them.
from shapely import shortest_line
# Generate LineString between nearest points of two geometries
connectors = nearest_roads.apply(
lambda row: shortest_line(row["geometry_x"], row["geometry_y"]), axis=1
)
# Create a new GeoDataFrame out of these geometries
connectors = gpd.GeoDataFrame({"geometry": connectors}, crs=roads.crs)
connectors["distance"] = connectors.length
connectors.head()| geometry | distance | |
|---|---|---|
| 0 | LINESTRING (384966.661 6671503.786, 384960.444... | 11.181066 |
| 0 | LINESTRING (384966.661 6671503.786, 384960.444... | 11.181066 |
| 1 | LINESTRING (385487.966 6672217.975, 385460.972... | 52.015824 |
| 1 | LINESTRING (385487.966 6672217.975, 385460.972... | 52.015824 |
| 2 | LINESTRING (385050.507 6671936.92, 385046.795 ... | 6.659959 |
Nearest neighbors with Polygon and LineString data
We can visualize the buildings, roads and these connectors to better understand the exact points where the distance between a given building and the closest road is shortest:
Exercise
parks and roads GeoDataFrames and follow the approaches presented above to find the closest road to each park.